ABSTRACT

Learning hierarchies are networks of prerequisite 
relationships of instructional objectives. Seven measures of the 
validity of learning hierarchies were compared for their ability to 
identify correctly- and incorrectly-ordered hierarchies. A computer 
simulation model was used to generate stochastic data of known 
underlying structure. Analysis of variance processing of the data 
indicated that three of the measures provide stringent but useful 
tests of hierarchy validity. (Author/BW) 






A COMPUTER SIMULATION STUDY OF MEASURES 
FOR VALIDATING LEARNING HIERARCHIES 

by 

A.B. Durell¹
The Ontario Institute for Studies in Education 






Paper presented at the Annual Meeting
of the
American Educational Research Association
Chicago, April 1974



1. Now at Department of Educational Psychology, Faculty of 
Education, University of Toronto. 






Problem 

Learning hierarchies (Gagné, 1965) are networks of
prerequisite relationships of instructional objectives. Use
of the term seems to have followed from the work of Gagné
and Paradise (1961). Designers of systematic approaches to
individualized instruction (Bolvin, 1968), diagnostic and
achievement testing (Glaser & Nitko, 1971), and mastery
learning (Bloom, 1971) often find the learning hierarchy
concept to be useful. Computer-based instructional systems
are often based on learning hierarchies. Examples may be
found in computer-managed instruction (Sass, 1971),
computer-based testing (Ferguson, 1970), and computer-assisted
instruction (Hicks & Hunka, 1972).

The most useful approach to hierarchy generation is to 
begin with the terminal objective of an instructional sequence 
and ask the question "What would the learner have to be able 
to do in order to attain this objective?" as suggested by Gagné
(1969). In this manner, behaviours prerequisite to performance 
of the terminal objective are identified. The question is repeated 
for each of the subordinate behaviours. Repetitive application 
of this heuristic generates a hierarchy of behaviours. The 
process is continued until it is reasonable to assume that the 
subordinate behaviours identified will be in the repertoires
of all learners to be instructed. 

Use of the foregoing procedure does not guarantee
production of a hierarchy that is valid. That is, a network
of objectives generated by logical analysis may prove to be
pedagogically ineffective. Resnick (1973) distinguished
between psychometric and transfer interpretations of 
hierarchy validity. Most attempts to construct learning 
hierarchies are grounded on a desire to identify 
prerequisite relationships which will provide transfer value 
between objectives in the hierarchy. However, most measures 
of hierarchy validity are based on psychometric evidence. 
Tests of transfer involve instructional intervention and the 
explicit comparison of the relative effectiveness of 
different orders of presenting instructional material. 
Psychometric measures use the performance patterns of 
learners for a hypothesized hierarchy to test for the
dependency relations which should exist if certain 
objectives truly are prerequisite to others. John Carroll, 
in the "Comments of Discussants" of the Resnick (1973) 
symposium, noted that psychometric evidence of hierarchical 
relationship is no guarantee that there is transfer value 
from one objective to another. However, psychometric
measures do constitute a necessary, though not sufficient,
condition for transfer to exist. Carroll suggested that
psychometric indications of hierarchy validity are of value
as heuristic devices in searching for hierarchies to test
for transfer value. The transfer test of hierarchy validity
is an exacting one which takes a good deal of time.
Therefore, psychometric measures of hierarchy validity are 
significant even though they are insufficient to certify the 
pedagogical worth of a hierarchy. All of the measures 
considered in this study are psychometric measures. 

The purpose of this study is to consider the 
effectiveness of several psychometric measures of hierarchy 
validity in detecting correctly and incorrectly sequenced 
objectives. There is a practical difficulty in the way of 
performing such a test with any given hierarchy of 
instructional objectives. What is desired is to test the 
ability of various measures to indicate whether or not a 
hierarchy is valid. However, one cannot know if the 
hierarchy which is used to test the measures is valid. If it 
were possible to know if the hierarchy were valid, there 
would be no need for the measures. In this study a model of 
learning hierarchically related material was formulated and 
used as the basis of a computer program to generate data 
simulating that which might be produced by learners. The 
use of a model made it possible to specify the underlying 
structure of the data. In particular, it was possible to 
decree in advance that the hierarchy for which data would be 
generated was or was not valid. 

Measures Considered

The prerequisite relationships which are assumed to
exist in a hierarchy have suggested the use of scalogram
analysis (Guttman, 1944) and multiple scalogram analysis
(Lingoes, 1963) as measures of hierarchy validity. One of
the major difficulties in applying scaling techniques to
hierarchy validation is that scaling techniques only
indicate linear relationships while most hierarchies involve
branches. Resnick and Wang (1969) found scaling procedures
awkward to apply to a branched hierarchy.

Another class of measures, less mathematically 
sophisticated than scaling techniques, can be identified in 
the literature. This class of measures may be characterized 
as step-by-step measures as they use data concerning the 
mastery or non-mastery of objectives by learners to 
calculate numerical values associated with every transition 
from one hierarchy level to another. These measures do not
produce any overall score of validity for an entire 
hierarchy, as scaling procedures do. 

In the minimal hierarchy in Figure 1, mastery of 
objective B is assumed to be a prerequisite of mastery of 
objective A. 

The possible patterns of results which a learner might 
produce for these two objectives are shown in Table 1. A +
indicates mastery and a - indicates non-mastery. The
pattern of results is represented by an ordered pair such as
(-+). The pattern (-+) indicates non-mastery of the higher
level objective (A) and mastery of the lower level objective
(B).

    Level 2    Objective A
                   |
    Level 1    Objective B

Figure 1: A two-objective hierarchy



TABLE 1

POSSIBLE RESPONSE PATTERNS FOR A LEARNER
ON TWO OBJECTIVES

    Objective A    Objective B    Pattern
         +              +          (++)
         +              -          (+-)
         -              +          (-+)
         -              -          (--)

In the calculational formulae to be presented for the
various measures, the response patterns will be used to
represent the number of learners demonstrating that
particular pattern. The proportion of responses fitting,
say, the (++) pattern will be symbolized by P(++). Thus,

\[ P(++) = \frac{(++)}{(++) + (+-) + (-+) + (--)} \]

The six step-by-step measures selected for
consideration in this study constitute all such measures
suggested in the literature. The name of each measure, an
abbreviation for future reference, its calculational
formula, its originator, value range, and criterion value
will be briefly stated.

1. Proportion of Positive Transfer (PPT). This measure
was proposed by Gagné and Paradise (1961). The formula for
PPT is

\[ \mathrm{PPT} = \frac{(++) + (--)}{(++) + (--) + (+-)} \]

PPT has a range of values from 0 to 1 and its criterion
value is .90. That is, if the value of PPT calculated
between any two objectives, for which a prerequisite
relationship is hypothesized, is greater than or equal to
.90, then PPT indicates that a hierarchical relationship
exists.

2. Order Ratio (OR). Phillips (1971) devised the order
ratio by adding the number of (-+) response patterns to the
numerator and denominator of PPT. Thus the formula for OR
is

\[ \mathrm{OR} = \frac{(++) + (--) + (-+)}{(++) + (--) + (-+) + (+-)} \]

OR has a range of values from 0 to 1 and a criterion value
of .90. The apparent complexity of the formula for OR may
obscure the fact that it is equal to

\[ \mathrm{OR} = 1 - P(+-) \]

3. Eisenberg-Walbesser Ratios. Three ratios, each of
which tests for some desirable property of response patterns
in a hierarchical relationship, were proposed by Eisenberg
and Walbesser (1971). All three ratios have a range of
values from 0 to 1. It was suggested that a hierarchical
relationship does not exist unless the values of all three
ratios are greater than or equal to .85. For the purposes
of this study, these three ratios were regarded as
components of a single ratio called the Combined
Eisenberg-Walbesser Ratio (CEW). CEW is equal to the
minimum of the three component ratios and has a criterion
value of .85.

The three component ratios and their formulae are:

Consistency Ratio

\[ \frac{(++)}{(++) + (+-)} \]

Adequacy Ratio

\[ \frac{(++)}{(++) + (-+)} \]

Completeness Ratio

\[ \frac{(++)}{(++) + (--)} \]

4. Phi (PHI). Phillips (1971) used a phi coefficient
as an indicator of hierarchical relationship. The phi
coefficient is the product moment correlation coefficient
for dichotomous data and its calculational formula, in terms
of response patterns, is

\[ \mathrm{PHI} = \frac{(++)(--) - (-+)(+-)}{\sqrt{[(++) + (+-)]\,[(-+) + (--)]\,[(++) + (-+)]\,[(+-) + (--)]}} \]

PHI has a range of values from -1 to +1, and in this study a
criterion value of .60.

5. Phi/Phimax (PPM). Resnick and Wang (1969) reported
that Carroll was developing a validation procedure based on

\[ \mathrm{PPM} = \frac{\mathrm{phi}}{\mathrm{phimax}} \]

where phimax is the maximum value which phi could have,
given the marginals of the contingency table. The
calculation of PPM has been described by Cureton (1959).
PPM has a range of values from -1 to +1 and a criterion value
of .60 was used in this study.

6. Difference Ratio (DR). This ratio was developed for
this study. A complete description of the development of
this ratio is given by Durell (1973). The formula for DR is

\[ \mathrm{DR} = \frac{(++) - (+-)}{(++) + (-+) + (+-) + (--) + |(-+) - (--)|} \]

The range of values for DR is from -1 to +1 and the
criterion value is .50.

7. Conditional Item Difficulty Index (CIDI). Airasian
(1971) proposed a measure which differs from the other 
measures considered in this study. The value of CIDI for 
level n of a hierarchy is given by dividing the number of 
learners who have achieved mastery of all the objectives at 
levels 1 through n by the number of learners who have 
achieved mastery of all objectives at levels 1 through 
(n-1). The numbering of levels is from the bottom to the
top of the hierarchy. For instance, objectives at level 5 
of the hierarchy are prerequisites for objectives at level 
6. CIDI has a range of values from 0 to 1 and a criterion 
value of .85 was used in this study. 
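
As an illustration only, the six step-by-step measures can be sketched in Python from the four pattern counts. The function and variable names are hypothetical and are not taken from the study. The treatment of phimax is an assumption: the sketch finds the most extreme table allowed by the marginals in the direction of the sign of phi, which keeps the result between -1 and +1; the procedure described by Cureton (1959) may differ in detail. Non-zero denominators are also assumed.

```python
import math

def phi_from_counts(pp, pm, mp, mm):
    """Phi coefficient from counts of (++), (+-), (-+), (--)."""
    den = math.sqrt((pp + pm) * (mp + mm) * (pp + mp) * (pm + mm))
    return (pp * mm - mp * pm) / den

def phi_over_phimax(pp, pm, mp, mm):
    """Phi divided by its extreme attainable value (assumed procedure)."""
    n = pp + pm + mp + mm
    a_plus, b_plus = pp + pm, pp + mp      # marginals for objectives A and B
    phi = phi_from_counts(pp, pm, mp, mm)
    if phi >= 0:
        pp_ext = min(a_plus, b_plus)       # largest possible (++) count
    else:
        pp_ext = max(0, a_plus + b_plus - n)  # smallest possible (++) count
    pm_ext = a_plus - pp_ext
    mp_ext = b_plus - pp_ext
    mm_ext = n - pp_ext - pm_ext - mp_ext
    return phi / abs(phi_from_counts(pp_ext, pm_ext, mp_ext, mm_ext))

def pairwise_measures(pp, pm, mp, mm):
    """Return the six step-by-step measures for one level transition.

    pp, pm, mp, mm are the numbers of learners showing the patterns
    (++), (+-), (-+), (--); the first sign refers to the higher objective.
    """
    n = pp + pm + mp + mm
    consistency = pp / (pp + pm)
    adequacy = pp / (pp + mp)
    completeness = pp / (pp + mm)
    return {
        "PPT": (pp + mm) / (pp + mm + pm),
        "OR": (pp + mm + mp) / n,          # equal to 1 - P(+-)
        "CEW": min(consistency, adequacy, completeness),
        "PHI": phi_from_counts(pp, pm, mp, mm),
        "PPM": phi_over_phimax(pp, pm, mp, mm),
        "DR": (pp - pm) / (n + abs(mp - mm)),
    }

# Criterion values as stated in the text for these six measures.
CRITERIA = {"PPT": 0.90, "OR": 0.90, "CEW": 0.85,
            "PHI": 0.60, "PPM": 0.60, "DR": 0.50}
```

For example, `pairwise_measures(60, 5, 20, 15)` gives a PPT of .9375 and a CEW of .75, so with these made-up counts PPT would indicate a hierarchical relationship while CEW would not.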

Data Generation Model 

A simplified model of learning hierarchically related 
material was developed. The model served as the basis for a 
computer program to generate simulated data for comparing 
the measures. The model made it possible to consider both 
the apparent state of a learner's mastery of a given
objective and the true underlying state. It is useful to
distinguish between these conditions by referring to them as 
"indicated mastery" and "true mastery", respectively. 

The model involves three parameters. The first 
parameter is called the "coefficient of transfer" (CT). CT
is the probability of a learner having true mastery at a 
particular level of a hierarchy, given true mastery at all 
subordinate levels. In this study, CT was given the values 
.75, .85, and .95 to represent hierarchies demonstrating a 
range from weak to strong transfer. The other two 
parameters are probabilities of indicating mastery. PM is 
the probability that a learner will be judged to have 
mastery of an objective, given that he is in a state of true 
mastery of the objective. PN is the probability that a 
learner will be judged to have mastery of an objective, 
given that he is in a state of true non-mastery of the 
objective. PM was given the values .90 and .95 as 
instructional systems are usually designed to minimize false 
indications of non-mastery. Similarly, PN was given the values 
.05 and .10 as indications of false mastery are also minimized 
in most systems. 

The model was used to generate indicated mastery states 
for each learner for each objective of a hierarchy. It was 
assumed that all learners had true mastery at the lowest
level of a hierarchy. The indicated mastery would then be
generated for the lowest level objective with a probability
of PM of indicating mastery. Then the true mastery state
for the learner on the next objective of the hierarchy was 
generated with a probability of CT of having true mastery. 
For each objective for which a learner had true mastery, the 
indicated mastery state was generated with a probability of 
PM that mastery would be indicated. 

When a state of non-mastery was generated for a 
particular objective of the hierarchy, the indicated mastery 
state was generated with a probability of PN that mastery 
would be indicated. Furthermore, once a learner entered a 
state of true non-mastery for a particular objective, he 
remained in a state of true non-mastery for all higher level 
objectives in strict adherence to the assumptions of 
learning hierarchy theory. Therefore, once a learner 
entered a state of true non-mastery it was no longer 
necessary to use the value of the parameter CT to generate 
the true mastery state for that learner for higher level 
objectives of the hierarchy. 
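
The generation procedure described above can be sketched as follows. This is a minimal reading of the model and not the original program: the function names and the use of Python's `random` module are assumptions, and one objective per level is assumed, as in the linear hierarchy the description implies.

```python
import random

def simulate_learner(n_levels, ct, pm, pn, rng):
    """Generate true and indicated mastery states for one learner.

    Returns two lists of booleans, lowest hierarchy level first.
    ct: probability of true mastery given true mastery at the level below
    pm: probability mastery is indicated given true mastery
    pn: probability mastery is indicated given true non-mastery
    """
    true_states, indicated = [], []
    mastered = True  # every learner is assumed to master the lowest level
    for level in range(n_levels):
        # CT is only consulted while the learner still has true mastery;
        # after the first true non-mastery, all higher levels stay non-mastered.
        if level > 0 and mastered:
            mastered = rng.random() < ct
        true_states.append(mastered)
        p_indicate = pm if mastered else pn
        indicated.append(rng.random() < p_indicate)
    return true_states, indicated

def simulate_group(n_learners, n_levels, ct, pm, pn, seed=None):
    """Return the indicated mastery states for a group of simulated learners."""
    rng = random.Random(seed)
    return [simulate_learner(n_levels, ct, pm, pn, rng)[1]
            for _ in range(n_learners)]
```

A call such as `simulate_group(100, 11, 0.85, 0.90, 0.10, seed=1)` would produce one group of 100 learners on an eleven-level hierarchy; the seed argument is only a convenience for reproducibility.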

The indicated mastery states for a learner were paired 
to give a response pattern for each transition from one 
hierarchy level to another. The response patterns were 
tallied over the whole set of simulated learners and used to 
calculate the values of the seven measures of learning
hierarchy validity. 
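
A sketch of the tallying step, and of the CIDI calculation as defined earlier, is given below. The data layout (one list of booleans per learner, lowest level first) and the function names are assumptions made for illustration; the sketch also assumes that at least one learner has mastered all levels below the one being examined.

```python
from collections import Counter

def tally_transitions(indicated_rows):
    """Tally response patterns for each transition between adjacent levels.

    indicated_rows: one list of booleans per learner, lowest level first.
    Returns one Counter per transition with keys '++', '+-', '-+', '--',
    where the first sign is the higher level and the second the lower level.
    """
    n_levels = len(indicated_rows[0])
    tallies = []
    for lower in range(n_levels - 1):
        counts = Counter({"++": 0, "+-": 0, "-+": 0, "--": 0})
        for row in indicated_rows:
            higher_sign = "+" if row[lower + 1] else "-"
            lower_sign = "+" if row[lower] else "-"
            counts[higher_sign + lower_sign] += 1
        tallies.append(counts)
    return tallies

def cidi(indicated_rows, n):
    """CIDI for level n, with levels numbered from 1 at the bottom.

    Learners with indicated mastery at levels 1..n divided by learners
    with indicated mastery at levels 1..(n-1).
    """
    passed_below = [row for row in indicated_rows if all(row[: n - 1])]
    passed_through = [row for row in passed_below if row[n - 1]]
    return len(passed_through) / len(passed_below)
```

Each Counter returned by `tally_transitions` holds the four counts needed by the step-by-step measures for that transition.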

Method 



The model was used to generate performance data on a 
stochastic basis for groups of 100 simulated learners each. 
The performance data were used to calculate values of each 
of the seven measures of hierarchy validity for each level 
transition in a hierarchy. An eleven-level hierarchy was
used. Thus there would be ten level transitions in the 
hierarchy and so ten indications by each measure as to 
whether a hierarchical relationship existed between 
successive levels of the hierarchy. The use of the 
simulation made it possible to know the true underlying 
nature of the hierarchy. On that basis, each value of a
validity measure could be evaluated as indicating a correct
or incorrect decision concerning existence of a hierarchical
relationship. Each measure indicated ten decisions. The
number of correct decisions, which could range from 0 to 10,
was the dependent variable.
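
The scoring of decisions can be illustrated with a short sketch. It assumes that every measure is compared with its criterion in the way stated for PPT (a value greater than or equal to the criterion indicates a hierarchical relationship); the function name and arguments are hypothetical.

```python
def count_correct_decisions(values, criterion, truly_hierarchical):
    """Count how many transitions a measure classifies correctly.

    values: the measure's value for each level transition
    criterion: the measure's criterion value
    truly_hierarchical: for each transition, whether the simulated
        structure really contains a prerequisite relationship
    """
    correct = 0
    for value, truth in zip(values, truly_hierarchical):
        indicates_relationship = value >= criterion
        if indicates_relationship == truth:
            correct += 1
    return correct
```

With ten transitions the returned count lies between 0 and 10, matching the range described above.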

The parameters of the model were used as three factors
in a factorial design. The probability values of the
parameters were used as levels of the factors. The seven
measures of hierarchy validity were used as a fourth factor
with seven levels. Thus the experimental design was
CTxPMxPNxMeasures which led to a 3x2x2x7 analysis of
variance. Data were subjected to an arcsin transformation.
Data for ten groups of 100 simulated subjects each were
obtained for each of the 84 cells of the design.
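
The layout of the design can be checked with a few lines of Python. The parameter levels are those given in the text. The exact form of the arcsin transformation is not stated in the paper; the sketch assumes the common form, the arcsine of the square root of the proportion of correct decisions.

```python
import itertools
import math

CT_LEVELS = (0.75, 0.85, 0.95)
PM_LEVELS = (0.90, 0.95)
PN_LEVELS = (0.05, 0.10)
MEASURES = ("PPT", "OR", "CEW", "PHI", "PPM", "DR", "CIDI")

# Every combination of factor levels is one cell of the 3x2x2x7 design.
cells = list(itertools.product(CT_LEVELS, PM_LEVELS, PN_LEVELS, MEASURES))

def arcsine_transform(correct, total=10):
    """Assumed transformation: arcsin of the square root of a proportion."""
    return math.asin(math.sqrt(correct / total))
```

Here `len(cells)` is 84, in agreement with the number of cells reported.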

Four experiments were performed involving a variety of
arrangements of the objectives of the hierarchy to test the
ability of the measures to detect errors in arrangements of
objectives. In one experiment the hierarchy had the
objectives in the correct order. Three experiments involved
incorrect orderings of the objectives. In one, two
objectives were out of correct order, in another three
objectives were out of correct order, and in the third the
objectives were ordered randomly.

Results 

The analyses of variance carried out on the data for 
the four experiments produced significant effects for the 
Measures factor in every case. That is, there were 
significant differences in the abilities of the seven 
measures to make correct decisions concerning the presence 
of hierarchical relationships in the four different 
orderings of the objectives of a hierarchy. These results 
are summarized in Figure 2. 

The mean numbers of correct decisions made showed that 
PPT and OR are generally less able to indicate correct 
decisions. In addition, the more incorrect the ordering of 
the objectives, the fewer correct decisions PPT and OR made.
This would indicate that PPT and OR have a tendency to 
indicate the presence of hierarchical relationships which do 
not exist. 

PPM performed slightly better than PPT and OR but not 
as well as the other four measures. PHI produced moderately
better results than PPM on the first three experiments and 
substantially better results on the fourth experiment. 

CEW produced a moderate average improvement over PHI on
the first three experiments and CIDI performed slightly 
better than CEW. DR produced the most consistent results 
overall. 

The more incorrect the ordering of the objectives, the 
more correct decisions CEW and DR made. This would indicate 
that CEW and DR have a tendency to indicate a lack of 
hierarchical relationship even when such relationships might 
exist. 

Further useful information was obtained by examining 
the interactions of the CT and Measures factors. PPT and OR 
tended to make correct decisions for the highest value of the
CT factor, but made many incorrect decisions for low values
of the CT factor. Conversely, CEW and DR made relatively
few correct decisions at the highest level of CT, but
performed quite well at lower values of CT. These
tendencies are of importance since CT is, in effect, an
indication of the strength of the hierarchy. Consistent
trends of this sort were not evident in the CTxMeasures
interaction data over the four experiments for the other
three measures (Durell, 1973).

Discussion

The tendency of PPT and OR to indicate a large number 
of correct decisions for a well-ordered hierarchy and for 
the high value of CT suggests that these measures are 
"liberal". That is, PPT and OR tend to produce high values 
for a wide range of frequencies of response patterns. This 
means that, for a hierarchy with incorrectly ordered 
objectives and/or low values of CT, PPT and OR will give 
incorrect indications of hierarchical relationship. 

On the other hand, CEW and DR had a tendency to produce 
low values and therefore to indicate that hierarchical 
relationships did not exist. CEW and DR might be 
characterized as "conservative" measures of hierarchy 
validity. These two measures made many correct decisions 
for hierarchies with correctly ordered objectives and/or 
medium or low CT values. Thus it seems that CEW and DR are
sufficient to detect many instances of lack of hierarchical 
relationship but do not perform as well in identifying 
instances in which hierarchical relationship does exist. 

PHI, PPM, and CIDI did not have as clearly 
distinguishable characteristics as PPT, OR, CEW, and DR. 
PPM had a generally poor ability to make correct decisions.
PHI had some of the characteristics of CEW and DR, but was 
less effective than those two measures in making correct 
decisions. CIDI was intermediate between CEW and DR in 
overall ability to indicate correct decisions. CIDI was
less affected by variations in the value of CT than CEW and
DR. 

The generally low ability of PPT, OR, and PPM to make
correct decisions suggests that they are poor indicators of 
hierarchy validity. PHI performed better, but not as well 
as CEW, CIDI, and DR. In addition, values of PHI are 
somewhat difficult to compute. 

CEW, DR, and CIDI demonstrated reasonable ability to
make correct decisions. However, all of these three 
measures tended to be better at indicating lack of 
hierarchical relationship than presence of hierarchical 
relationship. Of all the faults which a measure of 
hierarchy validity may have, this tendency to be 
conservative is not a difficult one to deal with. At worst 
it means that a proposed hierarchy will be judged against a 
very stringent criterion. It is suspected that changing the 
criterion value for these measures might lead to an 
improvement in the ability of one or all of them to make 
correct decisions. The task of determining optimum 
criterion values may be approached through further 
simulation studies. 